<!DOCTYPE html>
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://best.openssf.org/assets/css/style.css">
<link rel="stylesheet" href="checker.css">
<script src="checker.js"></script>
<script src="regex1.js"></script>
<link rel="license" href="https://creativecommons.org/licenses/by/4.0/">

<!-- See create_labs.md for how to create your own lab! -->

</head>
<body>
<!-- For GitHub Pages formatting: -->
<div class="container-lg px-3 my-5 markdown-body">
<h1>Lab Exercise regex1</h1>
<p>
This is a lab exercise on developing secure software.
For more information, see the <a href="introduction.html" target="_blank">introduction to
the labs</a>.

<p>
<h2>Goal</h2>
<p>
<b>Learn how to create simple regular expressions for input validation.</b>

<p>
<h2>Background</h2>
<p>
Regular expressions (regexes) are a widely-used notation for
expressing text patterns.
Regexes can be used to validate input; when used correctly,
they can counter many attacks.

<p>
Different regex languages have slightly different notations,
but they have much in common. Here are some basic rules for regex
notations:

<ol>
<li>The most trivial rule is that a letter or digit matches itself. That is, the regex “<tt>d</tt>” matches the letter “<tt>d</tt>”. Most implementations use case-sensitive matches by default, and that is usually what you want.
<li>Another rule is that square brackets surround a rule that specifies any of a number of characters. If the square brackets surround just alphanumerics, then the pattern matches any of them. So <tt>[brt]</tt> matches a single “<tt>b</tt>”, “<tt>r</tt>”, or “<tt>t</tt>”.
Inside the brackets you can include
ranges of symbols separated by dash ("-"), so
<tt>[A-D]</tt> will match one character, which can be one A, one B, one C,
or one D.
You can do this more than once.
For example,
the term <tt>[A-Za-z]</tt> will match one character, which can be
an uppercase Latin letter or a lowercase Latin letter.
(This text assumes you're not using a long-obsolete character system
like EBCDIC.)
<li>If you follow a pattern with “<tt>&#42;</tt>”, that means
“<i>0 or more times</i>”.
In almost all regex implementations (except POSIX BRE),
following a pattern with "<tt>+</tt>" means "<i>1 or more times</i>".
So <tt>[A-D]*</tt> will match 0 or more letters as long as every letter
is an A, B, C, or D.
<li>You can use "<tt>|</tt>" to identify options, any of which are acceptable.
When validating input, you should surround the collection of options
with parenthesis, because "<tt>|</tt>" has a low precedence.
So for example, "<tt>(yes|no)</tt>" is a way to match either "yes" or "no".
</ol>

<p>
<h2>Task Information</h2>
<p>
We want to use regexes to <i>validate</i> input.
That is, the input should <i>completely</i> match the regex pattern.
In regexes you can do this by using its default mode
(not a "multiline" mode), prepending some symbol, and appending a different
symbol.
Unfortunately, different platforms use different regex
symbols for performing a complete match to an input.
The following table shows a summarized version of
what you should prepend and append for
many different platforms (for their default regex system).

<p>
<table>
<tr>
<th>Platform
<th>Prepend
<th>Append
<tr>
<td>POSIX BRE, POSIX ERE, and ECMAScript (JavaScript)
<td>“^”
<td>“$”
<tr>
<td>Java, .NET, PHP, Perl, and PCRE
<td>“^” or “\A”
<td>“\z”
<tr>
<td>Golang, Rust crate regex, and RE2
<td>“^” or “\A”
<td>“$” or “\z”
<tr>
<td>Python
<td>“^” or “\A”
<td>“\Z” (not “\z”)
<tr>
<td>Ruby
<td>“\A”
<td>“\z”
</table>

<p>
For example, to validate in ECMAScript (JavaScript)
that an input is must be either “<tt>ab</tt>” or “<tt>de</tt>”,
use the regex “<tt>^(ab|de)$</tt>”.
To validate the same thing in Python, use
“<tt>^(ab|de)\Z</tt>” or “<tt>\A(ab|de)\Z</tt>” (note that the
regex pattern is slightly different).

<p>
More information is available in the OpenSSF guide
<a href="https://best.openssf.org/Correctly-Using-Regular-Expressions"
>Correctly Using Regular Expressions for Secure Input Validation</a>.

<p>
<h2>Interactive Lab (<span id="grade"></span>)</h2>

<b>Please create regular expression (regex) patterns
that meet the criteria below.</b>

<p>
Use the “hint” and “give up” buttons if necessary.

<h3>Part 1</h3>
<p>
Create a regular expression, for use in ECMAScript (JavaScript),
that only matches the letters "Y" or "N".
<!--
You can use this an example for new labs.
For multi-line inputs, instead of <input id="attempt0" type="text" ...>, use
<textarea id="attempt" rows="2" cols="65">...</textarea>
-->
<form id="lab1"><pre><code
><input id="attempt0" type="text" size="60" spellcheck="false" value="">
</code></pre>
<button type="button" class="hintButton">Hint</button>
<button type="button" class="resetButton">Reset</button>
<button type="button" class="giveUpButton">Give up</button>
</form>

<h3>Part 2</h3>
<p>
Create a regular expression, for use in ECMAScript (JavaScript),
that only matches one or more uppercase Latin letters (A through Z).
<!--
You can use this an example for new labs.
For multi-line inputs, instead of <input id="attempt0" type="text" ...>, use
<textarea id="attempt" rows="2" cols="65">...</textarea>
-->
<form id="lab2">
<pre><code
><input id="attempt1" type="text" size="60" spellcheck="false" value="">
</code></pre>
<button type="button" class="hintButton">Hint</button>
<button type="button" class="resetButton">Reset</button>
<button type="button" class="giveUpButton">Give up</button>
</form>

<h3>Part 3</h3>
<p>
Create a regular expression, for use in ECMAScript (JavaScript),
that only matches the words "true" or "false".
<!--
You can use this an example for new labs.
For multi-line inputs, instead of <input id="attempt0" type="text" ...>, use
<textarea id="attempt" rows="2" cols="65">...</textarea>
-->
<form id="lab3">
<pre><code
><input id="attempt2" type="text" size="60" spellcheck="false" value="">
</code></pre>
<button type="button" class="hintButton">Hint</button>
<button type="button" class="resetButton">Reset</button>
<button type="button" class="giveUpButton">Give up</button>
</form>

<h3>Part 4</h3>
<p>
Create a regular expression
that only matches one or more uppercase Latin letters (A through Z).
However, this time, do it for Python (not JavaScript).
<!--
You can use this an example for new labs.
For multi-line inputs, instead of <input id="attempt0" type="text" ...>, use
<textarea id="attempt" rows="2" cols="65">...</textarea>
-->
<form id="lab4">
<pre><code
><input id="attempt3" type="text" size="60" spellcheck="false" value="">
</code></pre>
<button type="button" class="hintButton">Hint</button>
<button type="button" class="resetButton">Reset</button>
<button type="button" class="giveUpButton">Give up</button>
</form>

<h3>Part 5</h3>
<p>
Create a regular expression
that only matches one Latin letter (A through Z), followed by a
dash ("-"), followed by one or more digits.
This time, do it for Ruby (not JavaScript or Python).
<!--
You can use this an example for new labs.
For multi-line inputs, instead of <input id="attempt0" type="text" ...>, use
<textarea id="attempt" rows="2" cols="65">...</textarea>
-->
<form id="lab5">
<pre><code
><input id="attempt4" type="text" size="60" spellcheck="false" value="">
</code></pre>
<button type="button" class="hintButton">Hint</button>
<button type="button" class="resetButton">Reset</button>
<button type="button" class="giveUpButton">Give up</button>
</form>

<br><br>
<p>
<i>This lab was developed by David A. Wheeler at
<a href="https://www.linuxfoundation.org/"
>The Linux Foundation</a>.</i>
<br><br><!-- These go in the last form if there's more than one: -->
<p id="correctStamp" class="small">
<textarea id="debugData" class="displayNone" rows="20" cols="65" readonly>
</textarea>

</div><!-- End GitHub pages formatting -->
</body>
</html>
